
First Version of Intel PCT Getting Started #37

Open
louie-tsai wants to merge 4 commits into intel:main from intel-ai-tce:pct


@louie-tsai louie-tsai commented Jun 23, 2026


Guides users through checking PCT status and configuring PCT for workloads such as AI inference with vLLM.

The Getting Started guide matches the content of the Intel PCT technical article:
https://www.intel.com/content/www/us/en/content-details/846906/priority-core-turbo-technology-pct-technology-technical-article.html

Signed-off-by: louie-tsai <louie.tsai@intel.com>
Added a frequency diagram for Xeon 6776P with PCT on and updated the PCT validation instructions.
@louie-tsai

Author

@rsiyer-intel
Please help review the PR. Thanks.

@rsiyer-intel rsiyer-intel requested a review from adgubrud June 25, 2026 01:10
Comment thread hardware/priority_core_turbo/README.md Outdated
✅ CLOS0 CPU count exactly matches the bucket-0 PCT logical budget.
```

## 4. Benchmark CLOS0 CPUs with Host PerfSpect

@rsiyer-intel rsiyer-intel Jun 25, 2026

Collaborator

Please provide a link to PerfSpect, since users may not be aware of this tool: https://github.com/intel/PerfSpect
When you say "Host PerfSpect", please clarify it as "Benchmark CLOS0 CPUs with the PerfSpect tool on the host" (there is no separate host and container PerfSpect).

Author

Thanks. I changed the title accordingly.
I also added PerfSpect installation instructions in a collapsed Details section.
Hope that helps.
[image]

docker compose --progress=plain --profile set up --abort-on-container-exit
docker compose --progress=plain --profile check up --abort-on-container-exit

./run_host_perfspect_benchmark.sh

Collaborator

Please make this a link to the script.

Author

Links can't be added inside a code block, so I added the link in the sentence above the code block. Hope that works.

This is the frequency diagram on Xeon 6776P with PCT on.
<img width="835" height="483" alt="image" src="https://github.com/user-attachments/assets/96f8855c-4b83-4c62-a0dd-fa2408f979fb" />

This is the expected pattern: small active core counts hold the highest PCT turbo

Collaborator

Please clarify that the frequency reported is the frequency when "X" number of cores are active. Can you elaborate on what the user needs to understand from the image above? Should they be looking at 1-8 cores being active? Are you referring to the SSE frequencies? They should be ignoring the SSE (expected) row.

Collaborator

Does frequency step down as more cores become active because the average across all cores is taken? Will CLOS0 cores always operate at a higher frequency, no matter how many cores are active?

Author

Since the CPU acts as the head node, we normally don't use AVX-512 or AMX on the CPU for GPU workloads, so we look at the SSE frequencies. As more CPU cores become active, frequency goes down. CLOS0 CPUs still share power with the other CPUs in the same power domain, so CLOS0 CPU frequency also drops when CLOS2/3 CPUs become active. You can see this in the diagram: once the active core count exceeds 8, frequency drops because the benchmark starts using CLOS2/3 CPUs.
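
To observe this on a live system, one option is to sample per-CPU frequency from the Linux cpufreq sysfs interface while the load changes. The following is a minimal sketch, assuming the standard `scaling_cur_freq` file (reported in kHz) is exposed on the host. The function names `cpu_freq_mhz` and `print_freqs` and the optional sysfs-root parameter are hypothetical and are not part of the PR's scripts.

```bash
# Hypothetical helpers, not part of the PR's scripts.

# Prints the current frequency of one logical CPU in MHz.
# $1: logical CPU number
# $2: sysfs CPU root (optional, defaults to the real location)
cpu_freq_mhz() {
    local cpu="$1"
    local root="${2:-/sys/devices/system/cpu}"
    local file="${root}/cpu${cpu}/cpufreq/scaling_cur_freq"
    local khz
    [ -r "$file" ] || return 1
    # read returns non-zero if the file lacks a trailing newline,
    # so also accept a non-empty value.
    read -r khz < "$file" || [ -n "$khz" ] || return 1
    echo $(( khz / 1000 ))
}

# Prints "cpu<N> <MHz>" for each CPU number given after the sysfs root.
print_freqs() {
    local root="$1"; shift
    local cpu
    for cpu in "$@"; do
        printf 'cpu%s %s\n' "$cpu" "$(cpu_freq_mhz "$cpu" "$root")"
    done
}
```

For example, `print_freqs /sys/devices/system/cpu 0 1 2 3` could be run repeatedly while the benchmark ramps up the active core count, to compare a CLOS0 CPU against the others.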

Comment thread hardware/priority_core_turbo/README.md Outdated

## Overview

**Intel® Priority Core Turbo (PCT)** is part of **Intel® Speed Select Technology – Turbo Frequency (SST-TF)**.

Collaborator

If you have any public links to Intel Speed Select Technology, please add them here.

Author

good idea. added links for PCT, SST, and SST-TF. thanks


### PCT bucket-count interpretation

`intel-speed-select turbo-freq info -l <level>` may print the same `bucket-0`,

Collaborator

Can you share how users can download or obtain the intel-speed-select tool? I don't think it is covered before this point.
Can you also show sample output of `info -l`?

Collaborator

I see that intel-speed-select is part of the Dockerfile. We should probably place this section after the build section, once the tool has been introduced.

Contributor

I think in general it would be good to have a section preceding PCT bucket-count interpretation that discusses the different software projects that are required/available for PCT, where to get them, and how to install them (even if it's just a reference to scripts used later in the document).

Author

@rsiyer-intel I think it is better to explain PCT first and cover environment setup later. I added a reference link to the build section at the beginning of this section.

Author

@adgubrud We do the interpretation for users, and pct_map_and_set_clos.sh then sets it according to the NUMA topology. This section is just background knowledge; users don't need to do anything themselves for that interpretation. Hope that addresses your question.

Contributor

@louie-tsai I think I understand the SW requirements a bit better now and my previous comment doesn't apply anymore.

What I'm trying to get at is that a user will likely be coming into this article with no understanding of how to achieve the steps you're outlining. If you mention in the overview that this guide provides a Docker container and supporting scripts without any extra software dependencies, it will be helpful to provide that context.

PCT_TOTAL_LOGICAL_CPUS=32
```

But the `turbo-freq` output shows two reporting anchors per package:

Collaborator

It would be good to specify that this is a two-socket (2S) Intel Xeon 6776P system with 64 cores per socket.

Author

added it accordingly

Export the kernel build variables first:

```bash
source ./set_kernel_env.sh

Collaborator

Can you mention the minimum kernel requirement in the main README itself?
echo "WARN: validated GNR PCT flow expects KERNEL_MM=6.8 and KERNEL_TAG=v6.8." >&2

Author

Actually, there is no minimum kernel version; 6.8 is just the default. When users run set_kernel_env.sh, it updates the environment based on their kernel version.
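
As an illustration of the behavior described above, here is a minimal sketch of how the kernel build variables could be derived from the running kernel. It is not the contents of set_kernel_env.sh, which may work differently. The function names are hypothetical, and the `v<major.minor>` tag format and the 6.8 fallback are assumptions based only on the warning message quoted in this thread.

```bash
# Hypothetical sketch; the real set_kernel_env.sh may differ.

# Extracts "major.minor" from a kernel release string
# such as "6.8.0-45-generic". Returns non-zero if it cannot be parsed.
kernel_mm_from_release() {
    local mm
    mm="$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1.\2/p')"
    [ -n "$mm" ] || return 1
    printf '%s\n' "$mm"
}

# Sets KERNEL_MM and KERNEL_TAG from the given release string
# (default: the running kernel), falling back to the assumed
# default of 6.8 when the release string cannot be parsed.
set_kernel_vars() {
    local release="${1:-$(uname -r)}"
    KERNEL_MM="$(kernel_mm_from_release "$release")" || KERNEL_MM="6.8"
    KERNEL_TAG="v${KERNEL_MM}"
    export KERNEL_MM KERNEL_TAG
}
```

Calling `set_kernel_vars` with no argument uses `uname -r`; passing a release string explicitly makes the logic easy to check without changing kernels.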

Comment thread hardware/priority_core_turbo/README.md Outdated

This is particularly effective for **GPU-accelerated AI inference**, where a small number of CPU threads handle
**latency-critical, mostly serial tasks** such as tokenization, scheduling, and feeding GPUs.
Running these threads on **High-Priority (HP) cores** improves GPU utilization, TTFT, and tail latency.

Contributor

Please include the expanded version of TTFT. It should be time-to-first-token, right?

Author

yes. added it accordingly. thanks



Comment thread hardware/priority_core_turbo/README.md Outdated
### PCT bucket-count interpretation

`intel-speed-select turbo-freq info -l <level>` may print the same `bucket-0`,
`bucket-1`, and `bucket-2` SST-TF table under multiple `powerdomain-*` anchors.

Contributor

Please include a quick description of what a powerdomain anchor is

Author

added it accordingly


For Intel® Xeon® 6776P system, `bucket-0` reports:

```text

Contributor

Please include the command that produces this output

Author

It is output from the check_pct_status.sh script. I added some explanation there.

Contributor

Please see my alternate suggestion above


But the `turbo-freq` output shows two reporting anchors per package:

```text

Contributor

Please include the command that produces this output

Author

These are parsed results; there is no direct command that produces them. Our script handles it all.
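
To make the counting rule concrete (the README counts `bucket-0` once per package, even though `turbo-freq` reports it under two anchors per package), here is a minimal sketch of the arithmetic. It does not parse real `intel-speed-select` output. The three-column input format, the function names, and the sample value of 8 high-priority cores per package are all assumptions, chosen so that the result lines up with the `PCT_TOTAL_LOGICAL_CPUS=32` value quoted in the README excerpt, assuming two threads per core.

```bash
# Hypothetical sketch; not the parsing logic used by the PR's scripts.

# Reads lines of the (assumed) form
#   <package> <powerdomain> <bucket-0 high-priority core count>
# from stdin and adds each package's count only once, even when the
# same bucket-0 value is repeated under several powerdomain anchors.
pct_cores_once_per_package() {
    awk '!seen[$1]++ { total += $3 } END { print total + 0 }'
}

# Logical CPU budget = high-priority cores * threads per core.
# $1: core count; $2: threads per core (optional, defaults to 2)
pct_logical_budget() {
    local cores="$1"
    local threads_per_core="${2:-2}"
    echo $(( cores * threads_per_core ))
}
```

With two packages that each repeat a count of 8 under two anchors, `pct_cores_once_per_package` yields 16 cores rather than 32, and `pct_logical_budget 16 2` yields 32 logical CPUs.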

Comment thread hardware/priority_core_turbo/README.md Outdated
0-3,32-35,64-67,96-99,128-131,160-163,192-195,224-227
```

This is the default strict bucket-0 PCT placement used by the updated set script.

Contributor

Is the updated set script "set_kernel_env.sh"? Please link to whichever script you're mentioning.

Author

this sentence is not required. removed it accordingly.
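
As a quick sanity check on the CPU list quoted at the top of this thread, the number of logical CPUs in a Linux-style CPU list can be counted with a few lines of shell. This is a minimal sketch; `count_cpus_in_list` is a hypothetical helper and is not part of the PR's scripts.

```bash
# Hypothetical helper, not part of the PR's scripts.
# Counts the logical CPUs in a Linux-style CPU list such as "0-3,32-35".
count_cpus_in_list() {
    local list="$1"
    local total=0
    local item start end
    local old_ifs="$IFS"
    IFS=','
    for item in $list; do
        case "$item" in
            # A range "a-b" contributes b - a + 1 CPUs.
            *-*) start="${item%-*}"; end="${item#*-}" ;;
            # A single CPU number contributes 1.
            *)   start="$item"; end="$item" ;;
        esac
        total=$(( total + end - start + 1 ))
    done
    IFS="$old_ifs"
    echo "$total"
}
```

Applied to `0-3,32-35,64-67,96-99,128-131,160-163,192-195,224-227`, the helper counts eight ranges of four CPUs each, which agrees with the `PCT_TOTAL_LOGICAL_CPUS=32` value quoted earlier in the review.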

Comment thread hardware/priority_core_turbo/README.md Outdated

<details>
<summary> Debug Details </summary>

Contributor

Are there common pitfalls you expect the user to run into? What will these useful commands help with?

Author

No, it is just an optional section for further debugging. In general, users don't need to run these commands themselves.

Contributor

Okay, what about a few-word description of the information each command provides? See below.

| **PCT capacity** | Count `bucket-0` once per package/socket |
| **HP CPU placement** | Dispatch the package-level PCT core budget across the package's PCT reporting powerdomain anchors |

For Intel® Xeon® 6776P system with 2 sockets and 64 cores per socket, `bucket-0` reports using check_pct_status.sh in [check-pct-status session](#2-check-pct-status):

Contributor

I think it would be helpful to rephrase like this:

Suggested change
For Intel® Xeon® 6776P system with 2 sockets and 64 cores per socket, `bucket-0` reports using check_pct_status.sh in [check-pct-status session](#2-check-pct-status):
The excerpts below are from execution of [check_pct_status.sh](./check_pct_status.sh) on an Intel® Xeon® 6776P system with 2 sockets and 64 cores per socket. Subsequent sections explain the outputs and how to interpret them.

Comment on lines +472 to +477
```bash
intel-speed-select --info
intel-speed-select turbo-freq info -l 1
intel-speed-select core-power info
intel-speed-select -c 0 core-power get-assoc
```

Contributor

Suggested change
```bash
intel-speed-select --info
intel-speed-select turbo-freq info -l 1
intel-speed-select core-power info
intel-speed-select -c 0 core-power get-assoc
```
```bash
intel-speed-select --info # what info will this give?
intel-speed-select turbo-freq info -l 1 # what info will this give?
intel-speed-select core-power info # what info will this give?
intel-speed-select -c 0 core-power get-assoc # what info will this give?
```

@adgubrud

Contributor

Please add an entry to the Optimization Zone main README under the Hardware section: https://github.com/intel-ai-tce/optimization-zone/tree/pct#table-of-contents
